Automatic language identification

Authors

  • Marc A. Zissman
  • Kay M. Berkling
Abstract

Automatic language identification is the process by which the language of a digitized speech utterance is recognized by a computer. In this paper, we will describe the set of available cues for language identification and discuss the different approaches to building working systems. This overview includes a range of historic approaches, contemporary systems that have been evaluated on standard databases, as well as promising future approaches. Comparative results are also reported.

Similar resources

Incorporating linguistic knowledge into automatic dialect identification of Spanish

Automatic dialect identification, like automatic language identification, has often been approached through the use of phonetic frequencies and phonetic sequence modeling. While such statistical systems perform well on language identification problems, they are less adept at the more difficult problem of automatic dialect identification, particularly on short segments of speech. In this paper ...

Kohonen Self Organizing for Automatic Identification of Cartographic Objects

Automatic identification and localization of cartographic objects in aerial and satellite images have gained increasing attention in recent years in digital photogrammetry and remote sensing. Although the automatic extraction of man-made objects is in essence still an unresolved issue, man-made objects can be extracted from aerial photos and satellite images. Recently, the high-resolution s...

Text-Based Automatic Language Identification

We present a statistical approach to text-based automatic language identification that focuses on discrimination between, as opposed to representation of, different language models. The system is evaluated on a text corpus containing six African and six European languages.

Automatic identification of language varieties: The case of Portuguese

Automatic language identification of written texts is a well-established area of research in computational linguistics. State-of-the-art algorithms often rely on n-gram character models to identify the correct language of texts, with good results seen for European languages. In this paper we propose the use of a character n-gram model and a word n-gram language model for the automatic classifica...

An approach to automatic figurative language detection: A pilot study

This pilot study explores a new approach to automatic detection of figurative language. Our working hypothesis is that the problem of automatic identification of idioms (and metaphors, to some extent) can be reduced to the problem of identifying an outlier in a dataset. By an outlier we mean an observation which appears to be inconsistent with the remainder of a set of data.

From perceptual designs to linguistic typology and automatic language identification: overview and perspectives

This paper gives an overview of methods in perceptual language identification and suggests a new approach based on a two-step methodology that integrates “genetic” considerations into perception and results in the modeling of perceptually identified discriminative cues. The first study reported here concerns experimental designs for perceptual and automatic identification of th...

Journal title:
  • Speech Communication

Volume 35  Issue

Pages  -

Publication date 2001